FIG. 2

The Greedy CaRT Selection Algorithm 



procedure Greedy(T(X), e, G, θ)

Input: n-attribute table T and n-vector of error tolerances e;
       Bayesian network G on the set of attributes X and
       threshold θ on the relative benefit for selecting a
       CaRT predictor.

Output: A set of materialized (predicted) attributes X_mat (X_pred
        = X - X_mat) and a CaRT predictor for each X_i ∈ X_pred.

begin
1.  X_mat := X_pred := ∅
2.  let <X_1, X_2, ..., X_n> be the attributes in X sorted in
    topological order of G
3.  for i := 1, ..., n
4.      if Π(X_i) = ∅ then X_mat := X_mat ∪ {X_i}   /* X_i must be
            materialized if it has no parents in G */
5.      else
6.          M := BuildCaRT(X_mat → X_i, e_i)
7.          if (MaterCost(X_i) / PredCost(X_mat → X_i) > θ) then X_pred :=
                X_pred ∪ {X_i}
8.          else X_mat := X_mat ∪ {X_i}
9.      end
10. end
end
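
The selection loop of FIG. 2 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: `mater_cost` and `pred_cost` are hypothetical stand-ins for MaterCost and for the combination of BuildCaRT and PredCost, which the figure does not define, and attributes are represented by plain strings.

```python
from typing import Callable, Dict, Sequence, Set, Tuple


def greedy_select(
    order: Sequence[str],
    parents: Dict[str, Set[str]],
    mater_cost: Callable[[str], float],
    pred_cost: Callable[[Set[str], str], float],
    theta: float,
) -> Tuple[Set[str], Set[str]]:
    """Split attributes into (materialized, predicted) sets.

    order      -- attributes in topological order of the Bayesian network
    parents    -- parents[x] is the parent set of x in the network
    mater_cost -- hypothetical stand-in for MaterCost(X_i)
    pred_cost  -- hypothetical stand-in that builds a CaRT from the current
                  materialized set and returns PredCost(X_mat -> X_i)
    theta      -- threshold on the relative benefit of prediction
    """
    x_mat: Set[str] = set()
    x_pred: Set[str] = set()
    for x in order:
        # Step 4: an attribute without parents must be materialized.
        if not parents.get(x):
            x_mat.add(x)
            continue
        # Steps 6-8: predict x only if the relative benefit exceeds theta.
        cost = pred_cost(set(x_mat), x)
        if cost > 0 and mater_cost(x) / cost > theta:
            x_pred.add(x)
        else:
            x_mat.add(x)
    return x_mat, x_pred
```

Because attributes are visited in topological order, every predictor is chosen only from attributes that have already been materialized, so a single pass over the attributes suffices.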
FIG. 3A
FIG. 3C
FIG. 4

The MaxIndependentSet CaRT Selection Algorithm 



procedure MaxIndependentSet(T(X), e, G, neighborhood())

Input: n-attribute table T and n-vector of error tolerances e;
       Bayesian network G on the set of attributes X and function
       neighborhood() defining the "predictive neighborhood" of a
       node X_i in G (e.g., Π(X_i) or β(X_i)).

Output: A set of materialized (predicted) attributes X_mat (X_pred = X -
        X_mat) and a CaRT predictor PRED(X_i) → X_i for each X_i ∈ X_pred.

begin
1.  X_mat := X, X_pred := ∅
2.  PRED(X_i) := ∅ for all X_i ∈ X, improve := true
3.  while (improve ≠ false) do
4.      for each X_i ∈ X_mat
5.          mater_neighbors(X_i) :=
                (X_mat ∩ neighborhood(X_i)) ∪ {PRED(X) : X ∈ neighborhood(X_i),
                X ∈ X_pred} - {X_i}
6.          M := BuildCaRT(mater_neighbors(X_i) → X_i, e_i)
7.          let PRED(X_i) ⊆ mater_neighbors(X_i) be the set of
                predictor attributes used in M
8.          cost_change_i := 0
9.          for each X_j ∈ X_pred such that X_i ∈ PRED(X_j)
10.             NEW_PRED_i(X_j) := PRED(X_j) - {X_i} ∪ PRED(X_i)
11.             M := BuildCaRT(NEW_PRED_i(X_j) → X_j, e_j)
12.             set NEW_PRED_i(X_j) to the (sub)set of
                    predictor attributes used in M
13.             cost_change_i := cost_change_i + (PredCost(PRED(X_j)
                    → X_j) - PredCost(NEW_PRED_i(X_j) → X_j))
14.         end
15.     end
FIG. 4 (cont)



16.     build an undirected, node-weighted graph G_temp = (X_mat,
        E_temp) on the current set of materialized
17.     attributes X_mat, where:
18.         (a) E_temp := {(X, Y) : for all pairs X, Y ∈ PRED(X_j) for some X_j ∈ X_pred} ∪
19.                       {(X_i, Y) : for all Y ∈ PRED(X_i), X_i ∈ X_mat}
20.         (b) weight(X_i) := MaterCost(X_i) - PredCost(PRED(X_i)
                → X_i) + cost_change_i for each X_i ∈ X_mat
21.     S := FindWMIS(G_temp)   /* select (approximate) maximum
                weight independent set in G_temp
22.             as "maximum-benefit" subset of
                predicted attributes */
23.     if (Σ_{X ∈ S} weight(X) ≤ 0) then improve := false
24.     else   /* update X_mat, X_pred, and the chosen CaRT predictors */
25.         for each X_j ∈ X_pred
26.             if (PRED(X_j) ∩ S = {X_i}) then PRED(X_j) :=
                    NEW_PRED_i(X_j)
27.         end
28.         X_mat := X_mat - S, X_pred := X_pred ∪ S
29.     end
30. end   /* while */
end
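
The graph construction and independent-set step of FIG. 4 can be illustrated with the Python sketch below. It rests on assumptions that the figure does not confirm: the edge set is taken to join (i) any two predictors of the same predicted attribute and (ii) each materialized attribute with each of its candidate predictors, and `greedy_wmis` is a simple greedy heuristic substituted here for FindWMIS, whose actual procedure is not given in the figure. All function names are hypothetical.

```python
from typing import Dict, FrozenSet, Set


def build_temp_edges(
    x_mat: Set[str], x_pred: Set[str], pred: Dict[str, Set[str]]
) -> Set[FrozenSet[str]]:
    """Build the undirected edge set E_temp (assumed form, see lead-in)."""
    edges: Set[FrozenSet[str]] = set()
    # Two predictors of the same predicted attribute may not both be chosen.
    for xj in x_pred:
        ps = sorted(pred[xj])
        for a in range(len(ps)):
            for c in range(a + 1, len(ps)):
                edges.add(frozenset((ps[a], ps[c])))
    # A materialized attribute may not be chosen together with its predictors.
    for xi in x_mat:
        for y in pred[xi]:
            edges.add(frozenset((xi, y)))
    return edges


def greedy_wmis(
    weights: Dict[str, float], edges: Set[FrozenSet[str]]
) -> Set[str]:
    """Greedy heuristic for a weighted independent set (not FindWMIS itself)."""
    adj: Dict[str, Set[str]] = {v: set() for v in weights}
    for e in edges:
        u, v = tuple(e)
        adj[u].add(v)
        adj[v].add(u)
    chosen: Set[str] = set()
    blocked: Set[str] = set()
    # Visit nodes by decreasing weight and skip neighbors of chosen nodes.
    for v in sorted(weights, key=lambda n: weights[n], reverse=True):
        if weights[v] <= 0:
            break  # non-positive weights cannot increase the total benefit
        if v not in blocked:
            chosen.add(v)
            blocked |= adj[v]
    return chosen
```

The greedy heuristic carries no optimality guarantee; it is used here only to show how the node weights and edges interact when a set of attributes is moved from the materialized set to the predicted set.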
FIG. 5

Algorithm for Estimating Lower Bound on Subtree Cost 
procedure LowerBound(N, e_i, b)

Input: Leaf N for which lower bound on subtree cost is to be
       computed; error tolerance e_i for attribute X_i; bound b
       on the maximum number of internal nodes in subtree
       rooted at N.

Output: Lower bound L(N) on cost of subtree rooted at N.

begin
1.  for i := 0 to r
2.      minOut[i, 0] := i
3.  for j := 1 to b + 1
4.      minOut[0, j] := 0
5.  l := 0
6.  for i := 1 to r
7.      while x_i - x_{l+1} > 2e_i
8.          l := l + 1
9.      for j := 1 to b + 1
10.         minOut[i, j] := min{minOut[i - 1, j] + 1, minOut[l, j - 1]}
11. end
12. L(N) := ∞
13. for j := 0 to b
14.     L(N) := min{L(N), 2j + 1 + j log(|X_i|) + (j + 1 + minOut[r, j + 1])
            log(|dom(X_i)|)}
15. L(N) := min{L(N), 2b + 3 + (b + 1) log(|X_i|) + (b + 2) log
        (|dom(X_i)|)}
16. return L(N)
end
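
The dynamic program of FIG. 5 can be sketched in Python as follows. The sketch makes several assumptions that the figure leaves implicit: x_1, ..., x_r are taken to be the attribute values in leaf N in non-decreasing order, logarithms are taken base 2, and the quantities written |X_i| and |dom(X_i)| are passed in as the hypothetical parameters `n_attrs` and `dom_size`.

```python
import math
from typing import Sequence


def lower_bound(
    xs: Sequence[float], e: float, b: int, n_attrs: int, dom_size: int
) -> float:
    """Lower bound L(N) on the cost of a subtree rooted at leaf N.

    xs       -- sorted values x_1..x_r in the leaf (0-based list here)
    e        -- error tolerance e_i for the predicted attribute
    b        -- bound on the number of internal nodes in the subtree
    n_attrs  -- value used for |X_i| in the cost formula (assumed meaning)
    dom_size -- value used for |dom(X_i)| in the cost formula
    """
    r = len(xs)
    # min_out[i][j]: table filled exactly as in steps 1-11 of the figure.
    min_out = [[0] * (b + 2) for _ in range(r + 1)]
    for i in range(r + 1):
        min_out[i][0] = i
    l = 0
    for i in range(1, r + 1):
        # x_i - x_{l+1} in the figure is xs[i - 1] - xs[l] with 0-based xs.
        while xs[i - 1] - xs[l] > 2 * e:
            l += 1
        for j in range(1, b + 2):
            min_out[i][j] = min(min_out[i - 1][j] + 1, min_out[l][j - 1])
    log_attrs = math.log2(n_attrs)
    log_dom = math.log2(dom_size)
    best = math.inf
    for j in range(b + 1):
        cost = 2 * j + 1 + j * log_attrs + (j + 1 + min_out[r][j + 1]) * log_dom
        best = min(best, cost)
    # Step 15: bound for subtrees with more than b internal nodes.
    return min(best, 2 * b + 3 + (b + 1) * log_attrs + (b + 2) * log_dom)
```

For instance, with values [1, 2, 10], tolerance 1, b = 1, `n_attrs` = 4, and `dom_size` = 8, the table gives minOut[3, 1] = 1 (the value 10 cannot share a width-2 interval with 1 and 2), and the returned bound is 1 + 2 · 3 = 7.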
FIG. 6
FIG. 7A
FIG. 7B